Defining Words with Words: Beyond the Distributional Hypothesis
Abstract
The way humans define words is a powerful way of representing them. In this work, we propose to measure word similarity by comparing the overlap between their definitions. This highlights linguistic phenomena that are complementary to the information extracted by standard context-based representation learning techniques. To acquire a large number of word definitions in a cost-efficient manner, we designed a simple interactive word game, Word Sheriff. As a byproduct of game play, it generates short word sequences that can be used to uniquely identify words. These sequences can be used to evaluate the quality of word representations, and they could ultimately provide an alternative way of learning them, since this approach overcomes some of the limitations of the distributional hypothesis. Moreover, because of the conversational nature of the game, inspecting player behaviour reveals interesting aspects of human strategies and knowledge acquisition beyond those of simple word association games. Lastly, we outline a vision of a communicative evaluation setting, where systems are evaluated based on how well a given representation allows them to communicate with human and computer players.
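As a rough illustration of measuring similarity through definition overlap, the sketch below scores two words by the Jaccard overlap of the word sets that define them. The abstract does not say which overlap measure is used, so Jaccard is an assumption here, and the function name `definition_overlap` and the toy definitions are hypothetical rather than data from Word Sheriff.

```python
def definition_overlap(def_a, def_b):
    """Jaccard overlap between two definitions given as iterables of words.

    Assumption: Jaccard is used only as one plausible overlap measure;
    the source does not specify the measure.
    """
    set_a = {w.lower() for w in def_a}
    set_b = {w.lower() for w in def_b}
    if not set_a and not set_b:
        # Two empty definitions share nothing; avoid division by zero.
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical toy definitions, invented for illustration only.
definitions = {
    "cat": ["animal", "pet", "meows", "whiskers"],
    "dog": ["animal", "pet", "barks", "loyal"],
    "car": ["vehicle", "wheels", "engine", "road"],
}

cat_dog = definition_overlap(definitions["cat"], definitions["dog"])
cat_car = definition_overlap(definitions["cat"], definitions["car"])
print(f"cat/dog overlap: {cat_dog:.3f}")
print(f"cat/car overlap: {cat_car:.3f}")
```

With these toy definitions, "cat" and "dog" share two of six distinct defining words, while "cat" and "car" share none, so the first pair scores higher.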
Similar resources
Distributional semantics beyond words: Supervised learning of analogy and paraphrase
There have been several efforts to extend distributional semantics beyond individual words, to measure the similarity of word pairs, phrases, and sentences (briefly, tuples; ordered sets of words, contiguous or noncontiguous). One way to extend beyond words is to compare two tuples using a function that combines pairwise similarities between the component words in the tuples. A strength of this...
Contextual Information in Semantic Space Models Beyond Words and Documents
Since synonyms are important lexical knowledge, various methods have been proposed for automatic synonym acquisition. Whereas most of the methods are based on the distributional hypothesis and utilize contextual clues, little attention has been paid to what kind of contextual information is useful for the purpose. As one of the ways to augment contextual information, we propose the use of indir...
Sixth International and Interdisciplinary Conference on Modeling and Using Context Workshop on Contextual Information in Semantic Space Models Beyond Words and Documents
Since synonyms are important lexical knowledge, various methods have been proposed for automatic synonym acquisition. Whereas most of the methods are based on the distributional hypothesis and utilize contextual clues, little attention has been paid to what kind of contextual information is useful for the purpose. As one of the ways to augment contextual information, we propose the use of indir...
The distributional hypothesis
Distributional approaches to meaning acquisition utilize distributional properties of linguistic entities as the building blocks of semantics. In doing so, they rely fundamentally on a set of assumptions about the nature of language and meaning referred to as the distributional hypothesis. This hypothesis is often stated in terms like “words which are similar in meaning occur in similar context...
Word meaning in context: a probabilistic model and its application to question answering
The need for assessing similarity in meaning is central to most language technology applications. Distributional methods are robust, unsupervised methods which achieve high performance on this task. These methods measure similarity of word types solely based on patterns of word occurrences in large corpora, following the intuition that similar words occur in similar contexts. As most Natural La...